<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<HTML>

<HEAD>
  <TITLE>Elkhound FAQ</TITLE>
  <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
  <style type="text/css">
    H1 { font-size: 150% }
    H2 { font-size: 125% }
    H3 { font-size: 100% }
    P.title { font-size: 175% }
    a.toc:link { text-decoration: none }
    a.toc:visited { text-decoration: none }
  </style>
</HEAD>

<body>

<center>
<p class="title"><b>Elkhound Frequently Asked Questions</b></p>
</center>

<p>
This page has answers to questions that I've received and think
might be useful generally.

<h1>Contents</h1>
                          
<p>
<!-- BEGIN CONTENTS -->
<!-- automatically generated by insert-html-toc; do not edit the TOC directly -->
<ul>
  <li><a class="toc" href="#reductions">1. Reduction Actions</a>
  <ul>
    <li><a class="toc" href="#execute_pruned_reductions">1.1 Does Elkhound ever run action routines on parses that are not part of a complete parse tree?</a>
    <li><a class="toc" href="#pruned_reduction_exception">1.2 What if a reduction has a side effect?</a>
  </ul>
  <li><a class="toc" href="#starting_parser">2. Starting the Parser</a>
  <ul>
    <li><a class="toc" href="#other_start_symbols">2.1 How do I start parsing from a symbol other than the start symbol?</a>
  </ul>
</ul>
<!-- END CONTENTS -->

<a name="reductions"></a>
<h1>1. Reduction Actions</h1>

<a name="execute_pruned_reductions"></a>
<h2>1.1 Does Elkhound ever run action routines on parses that are not part of a complete parse tree?</h2>
<!-- originally from Carl Eastlund -->

<p>
Yes, it does.  The question of whether a given reduction can be part of a
complete parse tree can, in general, only be answered by seeing all of the
input.  So the two choices are to reduce as you go, which is what Elkhound
does, or to build a parse tree/forest and then run all reductions after
parsing is complete.  Elkhound does not do the latter because building a parse
tree slows parsing by a large (~10) constant factor.

<p>
In certain circumstances, Elkhound can determine that an action corresponds
to a parse that is going to fail before processing of the current token
is finished (see <a href="glr.cc">glr.cc</a>, <tt>GLR::canMakeProgress()</tt>).
In such cases, the action is preempted, but this is just a quick hack that
can slightly improve performance.

<p>
Therefore, unless the grammar is LALR(1) (i.e., there are no
conflicts), you must assume that any reduction action might be part of
a parse that will fail.  If the reduction action has effects that need
to be undone, you can put the undo actions in <tt>del()</tt> methods.

<a name="pruned_reduction_exception"></a>
<h2>1.2 What if a reduction has a side effect?</h2>
<!-- originally from Carl Eastlund -->

<p>
Original question:
<blockquote>
  We have a scenario where an action routine is run on part of an
  expression - this part is legal on its own, but the subdivision is
  illegal.  The problem is that the action routine raises an exception
  due to context-sensitive invariants.
</blockquote>

<p>
This is an instance of a more general problem: actions that have side
effects.  Elkhound will call the <tt>del</tt> function corresponding to
reductions that are performed and then discarded, but if the action had a
side effect then undoing it in <tt>del</tt> can be very difficult.  Therefore you
have to be careful about using side effects.  Throwing an exception is
essentially an extreme form of side effect from Elkhound's point of view.

<p>
Probably the easiest solution is to delay the activity that leads to
throwing the exception.  The reduction action can build a record of the
reduction, but then postpone further processing until another pass, after
parsing is complete.  As long as this doesn't happen too often, you won't
incur the cost of a complete parse tree.

<p>
Another possibility is to examine the parse tables and see if a grammar
modification can eliminate the nondeterminism that leads to the parse
split.  This of course requires detailed understanding of how the parsing
algorithm works; it is probably only worth it if performance is crucial.


<a name="starting_parser"></a>
<h1>2. Starting the Parser</h1>

<a name="other_start_symbols"></a>
<h2>2.1 How do I start parsing from a symbol other than the start symbol?</h2>
<!-- originally from Bill McCloskey, also asked by Carl Eastlund -->

<p>
Sometimes people want to use a single Elkhound specification, but have
it start parsing with different symbols in different situations.
Eventually I'd like to build in a feature to do this automatically,
but in the meantime there is a trick that will work.

<p>
Suppose you have an Elkhound grammar with several nonterminals
you want to be able to use as start symbols:
<pre>
  nonterm(Foo*) Foo { -&gt; A B C {/*...*/} /*...*/ }
  nonterm(Bar*) Bar { /*...*/ }
  nonterm(Baz*) Baz { /*...*/ }
</pre>

<p>
You can create a wrapper nonterminal, which (when put first in the
grammar file) Elkhound will regard as the "real" start symbol, and
give it rules whose first symbol is a special token that selects the
start symbol:
<pre>
  nonterm(void*) WrapperStart {
    -&gt; TOK_SELECT_FOO f:Foo { return f; }
    -&gt; TOK_SELECT_BAR b:Bar { return b; }
    -&gt; TOK_SELECT_BAZ b:Baz { return b; }
  }
</pre>

<p>
Now, these special tokens (e.g. <tt>TOK_SELECT_FOO</tt>) are declared
like ordinary tokens but have the property that they are never yielded
by the lexer.  Instead, you manually provide them as the first token
when initializing the parser.  This is done instead of the usual
"priming step"; the parser engine will read the first real token once
it drops into its normal parsing loop:
<pre>
  Lexer lexer;                   // your implementation of LexerInterface
  lexer.type = TOK_SELECT_FOO;   // you pick what to provide here
  SemanticValue treeTop;
  glr.glrParse(lexer, treeTop);
  Foo *foo = (Foo*)treeTop;      // then interpret the result accordingly
</pre>

<p>
Above, I've used <tt>void*</tt> for convenience.  If you're bothered
by the lack of type safety, or using the OCaml version, you can use
some kind of tagged union type.  The union only has to carry the
final parse result from the WrapperStart symbol out to the caller,
so it is short lived and irrelevant to performance.




</body>
</html>
